This notebook will include informal meta-analyses of different metrics and methods for evaluating surgical skill.
The reported metrics compare novice and expert surgeons.
It is informal because it is not based on a systematic review, and because some studies were included under very relaxed conditions. For example, I picked the novice and expert groups without comparing their definitions between studies: novice = the weakest skill group in the study, expert = the strongest skill group in the study. If a study included more than two groups, I used the results of the weakest (= novice) and strongest (= expert) groups and discarded the others. If a study included more than one task, or several sub-tasks, I picked the one with the largest difference between groups.
Many papers did not report means and standard deviations explicitly, so they had to be estimated from boxplots or bar plots, or by some other means.
For example, some studies reported only a mean or median but no SE/SD. In those cases I estimated the SD/SE based on, e.g., the SD of another similar metric reported in the same study, or the SD from previous results for the same metric. See the Excel file for notes on each study.
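As a rough illustration of this kind of estimation, the sketch below shows two common textbook conversions: SD from a reported standard error, and SD from the quartiles of a boxplot under an approximate-normality assumption. These are not necessarily the conversions used for any particular study here (the per-study notes are in the Excel file); the function names and input numbers are hypothetical.

```python
import math


def sd_from_se(se, n):
    """SD recovered from the standard error of a mean of n observations."""
    return se * math.sqrt(n)


def sd_from_iqr(q1, q3):
    """Rough SD from the quartiles, assuming roughly normal data.

    For a normal distribution the interquartile range is about 1.35 SD.
    """
    return (q3 - q1) / 1.35


# Hypothetical values read off a bar plot (SE, n) and a boxplot (Q1, Q3).
print(round(sd_from_se(2.0, 16), 2))
print(round(sd_from_iqr(40.0, 67.0), 2))
```

Both conversions get less reliable for small or skewed samples, which is one more reason to treat the resulting effect sizes as informal.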
This may or may not be turned into a more systematic meta-analysis later.
Example metrics that will be most likely included (Bolded ones have priority)
The full list of papers and metrics can be found in the Excel file shared in the repo:
Last update: 19.7.2022. Added more results. Changed Laparoscopy -> Endoscopy, so all endoscopic procedures are now labeled ‘endoscopy’.
If you notice errors or know some good studies to be included, feel free to forward them to
jani.koskinen [ at ] uef.fi
or use the form below TBD
These values are used as input in the R meta package’s metagen function.
For more information, check:
Forest plot explanation
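For readers who want to see what a generic effect-size input to `metagen` can look like, here is a minimal sketch. It assumes the values in question are a standardized mean difference (Hedges' g) and its standard error, which would correspond to `metagen`'s `TE` and `seTE` arguments. The exact effect-size variant used in this notebook is not stated, so treat the function and the numbers below as hypothetical.

```python
import math


def hedges_g(mean_nov, sd_nov, n_nov, mean_exp, sd_exp, n_exp):
    """Return (g, se): bias-corrected standardized mean difference and its SE."""
    df = n_nov + n_exp - 2
    # Pooled standard deviation of the two groups.
    pooled_sd = math.sqrt(
        ((n_nov - 1) * sd_nov**2 + (n_exp - 1) * sd_exp**2) / df
    )
    d = (mean_nov - mean_exp) / pooled_sd  # Cohen's d
    j = 1 - 3 / (4 * df - 1)               # small-sample correction factor
    # Large-sample variance of d, then scaled by the correction factor.
    var_d = (n_nov + n_exp) / (n_nov * n_exp) + d**2 / (2 * (n_nov + n_exp))
    return j * d, j * math.sqrt(var_d)


# Hypothetical task times in seconds: novices 120 (SD 30), experts 80 (SD 20),
# with 10 participants per group.
g, se = hedges_g(120.0, 30.0, 10, 80.0, 20.0, 10)
print(f"g = {g:.3f}, se = {se:.3f}")
```

With this sign convention a positive g means the novice group had the larger value, for example a longer task time.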
Some general statistics of the studies included:
Number of unique studies: 88
Number of studies by surgical technique:
| Technique | Studies |
|---|---|
| Endoscopy | 44 |
| Microsurgery | 14 |
| Open Surgery | 12 |
| Radiography | 1 |
| Robotic Surgery | 8 |
Number of studies by metric:
| Metric | Studies |
|---|---|
| task_time | 35 |
| tool_path_length | 24 |
| tool_velocity | 16 |
| tool_idle | 8 |
| tool_movements | 16 |
| tool_jerk | 14 |
| tool_acceleration | 8 |
| tool_bimanual | 7 |
| pupil_dilation | 7 |
| tool_force | 12 |
| scale_OSATS | 9 |
How many samples are needed to detect a given effect size d? The calculation uses a t-test with alpha = 0.05 and power = 0.8, and assumes independent trials (e.g. no repeated measurements from the same participants).
Hover the mouse over the points in the plot to see the values. The sample size is per group, so you need this many samples in each group.
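The per-group sample size can be approximated by hand. The sketch below uses the normal approximation to the two-sided, two-sample test, n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2, rather than the exact t-test calculation behind the plot, so its results are typically slightly smaller than the plotted values. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.8):
    """Approximate samples per group for a two-sided two-sample comparison.

    Normal approximation; an exact t-test calculation usually asks for
    slightly more samples per group.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


for d in (0.2, 0.5, 0.8, 1.0):
    print(d, n_per_group(d))
```

Because n scales with 1/d^2, halving the effect size roughly quadruples the required number of samples per group.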
Some effect sizes from the meta-analyses are shown as baselines:
IT = Idle Time
TT = Task Time
BD = Bimanual Dexterity
TEPR = Task-Evoked Pupil Reaction/Dilation (estimated without one outlier study removed)
TJ = Tool Jerk
TF = Tool Force